Welcome to the Time and Emotion Minimum Viable Product
We will walk through the steps to parse and aggregate some of the new timesheets that we suggest knowlege workers should use to improve their lot.
BEFORE running the code you need to find the location that this notebook is stored to and then create 2 subfolders: logs and html
Copy the timesheet examples to the logs folder.
Copy the html snippets to the html folder.
In [7]:
import os
import pandas as pd
import webbrowser
DATA_IN = 'logs'
HTML_IN = 'html'
HEADER = 'html/chord_header.html'
FOOTER = 'html/chord_footer.html'
FILE_OUT = 'html/chord_gen.html'
KEY_ERROR = 'Bad Key reported'
START_COL = 8
FINISH_COL = 9
COLUMN_NAMES = ['Priority', 'Description', 'Deadline', 'StartTime', 'EndTime', 'Duration', 'Project', 'Filler_1',
'StartState', 'FinishState', 'Learn_Apply', 'Attention', 'Source', 'Notes', 'Reflection']
SHEET_NAMES = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')
FLOW_STATES = ['Annoyed', 'Anxiety', 'Attentive', 'Bored', 'Flow', 'Panic', 'Puzzled', 'Relaxed']
Lets Parse the files
Use the location specified in DATA_IN, above, for the log files
In [8]:
df_total = pd.DataFrame(columns=COLUMN_NAMES)
for root_folder, dirs, files in os.walk(DATA_IN):
for file_name in files:
if '~' != file_name[0]:
xlfile = pd.ExcelFile(os.path.join(DATA_IN, file_name))
sheet_names = xlfile.sheet_names
for sheet_name in sheet_names:
if sheet_name in SHEET_NAMES:
df = xlfile.parse(sheet_name)
if (len(df.columns)) != 15:
print '% s %s %d' % (file_name, sheet_name, len(df.columns))
df.columns = COLUMN_NAMES
# TODO ensure the datetime columns are the correct format
# TODO calculate the timestamps using the filename and appropriate times
try:
df_total = pd.concat([df_total, df[(df.ix[:, 8].isin(FLOW_STATES) & df.ix[:, 9]
.isin(FLOW_STATES))]])
except KeyError:
print '%s %s %s ' % (KEY_ERROR, file_name, sheet_name)
s_temp = df_total.groupby([df_total.FinishState, df_total.StartState], as_index=False).size()
df_chord = s_temp.unstack().fillna(0)
# Add any missing columns
for col_name in FLOW_STATES:
if col_name not in df_chord.columns:
df_chord.insert(FLOW_STATES.index(col_name), col_name, 0)
print df_chord
Build the chord diagram
In [9]:
s_data = str((df_chord.as_matrix()))
s_data = s_data.replace('.', ',')
s_data = s_data.replace(',]', '],', 7)
s_data = s_data.replace(',]]', ']]')
content = open(HEADER).read()
content += 'var matrix = %s;' % s_data
content += open(FOOTER).read()
fout = open(FILE_OUT, 'w')
fout.write(content)
fout.flush()
fout.close()
path = os.path.abspath(FILE_OUT)
url = 'file://' + path
with open(path, 'w') as f:
f.write(content)
webbrowser.open(url)